The influence of task difficulty, social tolerance and model success on social learning in Barbary macaques

Despite playing a pivotal role in the inception of animal culture studies, macaque social learning is surprisingly understudied. Social learning is important to survival and, in social animals, is influenced by dominance and affiliation. Individuals generally rely on social learning when individual learning is costly, and selectively use social learning strategies that influence what is learned and from whom. Here, we combined social learning experiments, using extractive foraging tasks, with network-based diffusion analysis (using various social relationships) to investigate the transmission of social information in free-ranging Barbary macaques. We also investigated the influence of task difficulty on reliance on social information and evidence for social learning strategies. Social learning was detected for the most difficult tasks only, with huddling relations outside task introductions, and observation networks during task introductions, predicting social transmission. For the most difficult task only, individuals appeared to employ a social learning strategy of copying the most successful demonstrator observed. Results indicate that high social tolerance provides social learning opportunities and influences social learning processes. The reliance of Barbary macaques on social learning and on cues of model success supports the costly information hypothesis. Our study adds statistical support to previous claims indicative of culture in macaques.


S2 - Social rank analysis
Data on agonistic encounters were collected using behavioural measures based on agonistic competitions and formal dominance to calculate dominance ranks [1]. Agonistic competitions are defined as dyadic interactions where one subject (the winner) directs an agonistic behaviour (e.g. hit, slap, bite, threat) towards another subject (the loser), who displays a submissive behaviour (e.g. flee, silent-bared teeth display, submissive grin; see Table 1 in Main Text). These agonistic encounters are characterized by the asymmetry of the outcome (i.e. win or lose) [2]. On the other hand, lack of aggressiveness/agonism refers to those instances where conflicts are resolved using non-agonistic assessments (i.e. submissive behaviour) and escalated fights do not take place [2]. In these contexts, it is assumed that one subject (the subordinate) has learned from previous encounters with a conspecific, or recognizes some dominance features in its opponent, and that this biases its behaviour towards fleeing-upon-approach responses or submission/yielding when receiving threats [3]. The subordinate therefore recognizes its inferior position, so the dominance relationship is readily accepted instead of agonistically challenged. This dominance context is termed 'formal dominance' [1].
Dominance ranks can be calculated using different statistical methods. Here, we followed the recommendations of Funkhauser et al. (2018) [1]: calculate dominance ranks using different approaches, correlate the scores obtained across methods and, if there are no significant differences, calculate median ranks across all these ranking procedures. This minimizes errors and yields a conservative interpretation of the dominance hierarchy with minimal data.
Following Funkhauser et al. (2018) [1], we measured dominance ranks using five ranking methods: a) the I&SI method, a non-parametric method that re-organizes a matrix of dominance relations (i.e. a dominance matrix) so as to maximize or minimize a numerical criterion; b) David's scores, which calculate each individual's overall success and rank the subjects according to this measure; c) Elo-ratings, a non-matrix-based technique that determines dominance ranks assuming linearity [4]; and d) ADAGIO [5] and e) PERC [6], two methods that analyse dominance without making structural assumptions about the hierarchy (i.e. network-based methods). More information about the hierarchy methods used in this study can be found in Table S2.1 and in Funkhauser et al. (2018) [1]. Finally, we used Spearman rank correlations with Benjamini-Hochberg corrections to determine the reliability across rankings provided by the different methods.
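The cross-method comparison described above can be illustrated with a minimal sketch, assuming each method yields one numeric score per individual. The function names (`average_ranks`, `spearman_rho`, `benjamini_hochberg`) are hypothetical and written for illustration; this is not the code used in the study, and only the correlation coefficient (not its p-value) is computed here.

```python
def average_ranks(values):
    """Rank values from smallest to largest, giving tied values their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

With five ranking methods there are ten pairwise correlations, so `benjamini_hochberg` would be applied to the ten corresponding p-values.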

Description (I&SI)
The method re-organizes individuals under the assumption of a linear hierarchy, minimizing the number of inconsistencies (I) and the total strength of inconsistencies (SI) in a matrix of dominance relations [7][8][9].
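To make the two criteria concrete, the following sketch counts inconsistencies (dyads in which the lower-ranked individual wins more often) and their total strength (the rank distance spanned by each inconsistency) for a given order. The brute-force search over all orders is an illustrative stand-in that is only feasible for very small groups; it is not the iterative I&SI procedure implemented in DomiCalc or in R, and the function names are hypothetical.

```python
from itertools import permutations


def count_inconsistencies(wins, order):
    """Return (I, SI) for a candidate rank order.

    wins[i][j] is the number of times individual i defeated individual j;
    order lists individual indices from highest to lowest rank.
    """
    n_inconsistencies = 0
    strength = 0
    for hi in range(len(order)):
        for lo in range(hi + 1, len(order)):
            upper, lower = order[hi], order[lo]
            if wins[lower][upper] > wins[upper][lower]:
                n_inconsistencies += 1
                strength += lo - hi  # rank distance of the inconsistency
    return n_inconsistencies, strength


def best_order(wins):
    """Exhaustively find an order minimizing I first and SI second."""
    n = len(wins)
    return min(permutations(range(n)),
               key=lambda order: count_inconsistencies(wins, order))
```

For example, in a three-individual cycle (0 beats 1, 1 beats 2, 2 beats 0) every order contains at least one inconsistency, which is why several equally optimal solutions can exist.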

Calculations (I&SI)
DomiCalc, a series of Excel macros, was used (the script is the same as that applied in R by Leiva et al., 2010 [10] using the ISI.method function). In contrast to the R function, DomiCalc provides all the alternative optimal ranking solutions and, in the last step of the procedure, uses the difference between the numbers of dominations and subordinations (Dom-Sub) and the proportion of dominations (PD) to break ties and decide the final ranking order (optimization of the method) [9]. Linearity was measured using the improved Landau h' test with the linear.hierarchy.test function in R.

Description (David's scores)
The method derives a dominance index based on the overall success of each individual and the relative strength of its opponents. It calculates the proportion of wins over losses in agonistic encounters relative to the total number of observed interactions, corrected for chance occurrences of observed outcomes [11][12].

Calculations (David's scores)
The R function steeptest (package 'steepness') was used, which calculates steepness and derives David's scores using the improved algorithm with the correction of chance probabilities suggested by Gammell et al. (2003) [12]. Normalized and non-normalized David's scores were estimated. Steepness of a dominance hierarchy refers to the size of the absolute differences between adjacently ranked individuals in their overall success in winning dominance encounters. Steepness is the absolute slope of the straight line fitted to the normalized David's scores plotted against individual ranks [13].
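As an illustration of how David's scores and steepness relate, here is a minimal sketch using the standard form DS = w + w2 - l - l2 and the normalization (DS + N(N-1)/2)/N. Note the simplifying assumption: it uses raw win proportions and omits the chance correction applied in the study, so its values will differ from those returned by steeptest. Function names are hypothetical.

```python
def davids_scores(wins):
    """Return (raw, normalized) David's scores from a win matrix.

    wins[i][j] is the number of times i defeated j. Uses uncorrected
    dyadic win proportions (no correction for chance).
    """
    n = len(wins)
    prop = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            total = wins[i][j] + wins[j][i]
            if i != j and total > 0:
                prop[i][j] = wins[i][j] / total
    w = [sum(prop[i]) for i in range(n)]                        # summed win proportions
    l = [sum(prop[j][i] for j in range(n)) for i in range(n)]   # summed loss proportions
    w2 = [sum(prop[i][j] * w[j] for j in range(n)) for i in range(n)]
    l2 = [sum(prop[j][i] * l[j] for j in range(n)) for i in range(n)]
    raw = [w[i] + w2[i] - l[i] - l2[i] for i in range(n)]
    normalized = [(d + n * (n - 1) / 2) / n for d in raw]
    return raw, normalized


def steepness(normalized_scores):
    """Absolute least-squares slope of normalized scores against rank."""
    values = sorted(normalized_scores, reverse=True)
    ranks = list(range(1, len(values) + 1))
    mean_r = sum(ranks) / len(ranks)
    mean_v = sum(values) / len(values)
    cov = sum((r - mean_r) * (v - mean_v) for r, v in zip(ranks, values))
    var = sum((r - mean_r) ** 2 for r in ranks)
    return abs(cov / var)
```

A perfectly linear, fully decided hierarchy gives a steepness of 1, whereas values near 0 indicate small differences in overall success between adjacently ranked individuals.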

Description (Elo-ratings)
The method provides sequential estimations of individual dominance strengths based on the actual sequence of dominance interactions. It assumes that the chance of individual A winning against individual B is a function of the difference between the current ratings of the two contestants. The method therefore takes into account the sequence of interactions and updates the rating of each individual after each contest, up to the last contest observed, to provide the final ranking scores [4][14].

Calculations (Elo-ratings)
The R package 'EloOptimized' was used to calculate Elo-ratings using the traditional method (eloratingfixed function) and the optimized method (eloratingopt function). The traditional method treats the parameter k (which determines the number of rating points that an individual wins or loses after each encounter) as constant and assigns the same initial Elo-score to all individuals (1000 by default). The optimized method uses a maximum likelihood approach to find the values of k and of the initial Elo-ratings that best fit the data (using AIC measures of model fit). Rank stability was measured using the ratio of rank changes per individual present over a given time period [15]. The stability index is formally expressed as $S = \frac{\sum_i (C_i \times w_i)}{\sum_i N_i}$, where $C_i$ is the sum of absolute differences between rankings of two consecutive days, $w_i$ is a weighting factor determined as the standardized Elo-rating of the highest-ranking individual involved in a rank change, and $N_i$ is the number of individuals present on both days.
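The sequential updating described for Elo-ratings can be sketched as follows. This is an illustrative example under stated assumptions: it uses the common logistic (base-10, scale 400) form of the expected-win probability and a fixed, arbitrary k of 100; the individual labels are hypothetical, and this is not the EloOptimized implementation.

```python
def elo_update(ratings, winner, loser, k=100.0):
    """Update ratings in place after one contest won by `winner`."""
    diff = ratings[winner] - ratings[loser]
    # Probability that the eventual winner was expected to win.
    p_win = 1.0 / (1.0 + 10 ** (-diff / 400.0))
    change = k * (1.0 - p_win)  # unexpected wins move more points
    ratings[winner] += change
    ratings[loser] -= change


def run_sequence(individuals, contests, start=1000.0, k=100.0):
    """Process (winner, loser) pairs in chronological order."""
    ratings = {name: start for name in individuals}
    for winner, loser in contests:
        elo_update(ratings, winner, loser, k)
    return ratings
```

Because each contest transfers points from loser to winner, the sum of ratings is conserved, and a win against a higher-rated opponent changes both ratings more than a win against a lower-rated one.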

Description (PERC)
The method is a network-based ranking model. It attempts to determine the dominance ranks of individuals in a group under the assumption that the hierarchy structure may not be linear, so that dominance relations are not completely transitive: if A dominates B and B dominates C, A does not necessarily dominate C [6]. The method follows two main steps. First, a series of matrices is calculated based on pairwise interactions plus transitive dominance inferred from interactions with common third parties, in order to estimate dominance probabilities. Then, individuals are assigned a rank according to the final matrix calculated in step one, and a simulated annealing algorithm (see [6]) is used to minimize the number of inconsistencies and provide a final rank (confidence bounds of individual ranks are derived).

Calculations (PERC)
The R package 'Perc' and the guidelines developed by the Fushing & McCowan labs were used. Calculations were based on a matrix that combines information from direct win/loss interactions with information from indirect pathways between individuals to calculate a matrix of probabilities that each row individual outranks the column individual. The analysis provides heat maps of the individual ranks to identify non-linear dominance structures and takes into account the uncertainty of the data due to potential intransitivities. The annealing algorithm seeks to minimize the costs of potential inconsistencies due to these intransitivities by re-ordering the matrix (with values of dominant subjects above the diagonal and values of subordinate subjects below the diagonal).

A total of 828 agonistic encounters were used in the hierarchy analyses. Linearity was low but significant (h' = 0.112, p < 0.001), as was steepness (stp = 0.049, p < 0.001). Only one optimal ranking order was obtained for the I&SI method (I: 6, SI: 62). Elo-ratings were calculated using the optimized algorithm with a burn-in period of 100 interactions. In the first 100 interactions, information on social rank was available for only 31 out of 56 individuals (55% of the group). Therefore, we calculated stability from the moment information on dominance relations was available for all individuals until the end of the study (a total of 45 days). Results indicate that the hierarchy was somewhat stable but that changes in dominance relations were not rare (rank differences = 1165, S = 0.397). This may be due to the sample size: the group is large, and the complexity of the environment the monkeys live in may have hindered both the occurrence and the observation of agonistic encounters.
PERC resulted in 642 transitive triangles, 11 intransitive triangles and a transitivity value of 0.98, indicating that dominance relations were linear. Since some intransitivities were present, some optimal solutions of the PERC simulations had greater costs than others (cost range: 54.99-67.62). Therefore, the best ranking order is the one provided by the simulation with the lowest cost value.
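The reported transitivity value is consistent with the proportion of transitive triangles among all fully resolved triangles, 642 / (642 + 11). The sketch below assumes that definition and shows how triangles could be classified from a binary dominance matrix; the function names are hypothetical and this is not the 'Perc' package code.

```python
from itertools import combinations


def triangle_census(dominates):
    """Count (transitive, intransitive) triangles.

    dominates[i][j] is True when i dominates j. Triads with an
    unresolved or tied dyad are skipped.
    """
    transitive = intransitive = 0
    for a, b, c in combinations(range(len(dominates)), 3):
        dyads = [(a, b), (b, c), (a, c)]
        if not all(dominates[i][j] != dominates[j][i] for i, j in dyads):
            continue
        cyclic = ((dominates[a][b] and dominates[b][c] and dominates[c][a]) or
                  (dominates[b][a] and dominates[c][b] and dominates[a][c]))
        if cyclic:
            intransitive += 1
        else:
            transitive += 1
    return transitive, intransitive


def transitivity(transitive, intransitive):
    """Proportion of resolved triangles that are transitive."""
    return transitive / (transitive + intransitive)
```

Applied to the counts reported above, `transitivity(642, 11)` rounds to 0.98.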
Uncertainty measures indicated that 14.8% of all possible dyads (458 out of 3080) were not clearly defined (i.e. uncertainty probability <0.60, Figure S2.2). Again, the large and complex environment and the size of this group of Barbary macaques make it highly likely that most of the possible dyadic interactions either did not take place or were so infrequent that they could rarely be observed. As a result, some dominance relations may not have been well established and may even have changed depending on the audience present at the moment of the encounter.

Due to the presence of intransitivities and the small sample size, pre-processing approaches were used in ADAGIO to break symmetries and potential ties of dominance relations at the dyadic level [5]. The DCI (DCI = 0.963, see Table S2.1) indicated that the data mostly contained unidirectional relations, meaning that in most cases the dominance relationships between individuals were well defined and there were few cases of tied dominance relations [5]. Both top-down and bottom-up approaches led to the same ranking order.

Correlations
Spearman

S3 - Description of the tasks
Three extractive foraging tasks of increasing difficulty (with raisins used as rewards) were presented to a group of Barbary macaques.

Task 1: Blue/yellow task
The blue/yellow task consisted of a rectangular wooden box, 28 (w) x 16 (h) x 16 (d) cm, with two option holes in the top (6 x 6 cm), one framed in yellow and the other framed in blue (Figure 1A in Main Text), inspired by a task used by Kendal et al. (2005) [24] with callitrichid monkeys. These colours were chosen because they are equally visible to di- and trichromatic individuals. Between the two holes, inside the box, hung two connected pendulum doors (Figure 1A in Main Text). When a monkey put its hand into one of the holes, the pendulum on that side was pushed to the centre of the box, causing the other pendulum to cover the other hole. This mechanism prevented both holes from being used simultaneously to retrieve rewards. When necessary, the task was refilled using the two option holes. The task was fixed to the ground using long U-shaped metal anchor stakes.
Colours were used to distinguish the two options. Afro-Eurasian primates like Barbary macaques have trichromatic vision [25][26][27], so they are capable of distinguishing blue from yellow, green or red. In macaque species, a colour preference has been found only for red items [28], which supports the foraging hypothesis that trichromatic vision is an adaptation facilitating the visual detection of ripe fruit [29][30]. To prevent colour biases, red was avoided in the tasks.

Task 2: Push/lift-up task
The push/lift-up task was also inspired by a task used by Kendal  the bottom of the box allowed monkeys to manipulate the swing door. The task was refilled through a hole in the back that was covered with a wooden lid screwed to the box. By unscrewing one of the two screws in the lid, the researcher could swing the lid to one side and refill the task. The raisins were placed at the back of the box. The task was attached to a metal cylinder that was already fixed to the ground in the enclosure.

Task 3: Rotating-door task
The rotating-door task was inspired by a task used with wild lemurs [31], and consisted of a square wooden box, 23 (w) x 23 (h) x 23 (d) cm, with a circular retrieval hole (8 cm in diameter) that was covered by a circular door (9.5 cm in diameter) that could be rotated clockwise or counter-clockwise (Figure 1C in Main Text). By rotating the door, monkeys could uncover the hole that gave access to the raisins. Once it was uncovered, the monkeys could stretch their arms through the retrieval hole to reach the raisins placed inside at the bottom of the box. The task was refilled using the circular retrieval hole. The task was fixed to the ground using long U-shaped metal anchor stakes.

S4 - Inter-observer reliability
Both unweighted (U) and weighted (W) versions of Cohen's kappa were measured. Weighted kappa differs from unweighted kappa in that it incorporates the magnitude of each disagreement, giving partial credit when agreement is not complete [34].
Sessions coded by CE included additional information on who observed whom that was missing when IG coded the videos. This difference might explain why agreement was lowest for this variable (Table S4); it makes the data coded on who observed whom more conservative rather than less reliable.
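The difference between unweighted and weighted kappa can be seen in a small sketch. It assumes ordered categories and linear disagreement weights (|i - j| / (k - 1)); the weighting scheme actually used in the study is not stated here, so this choice is an assumption, and the function name is hypothetical.

```python
def cohens_kappa(rater_a, rater_b, categories, weighted=False):
    """Cohen's kappa for two raters coding the same items.

    `categories` lists the possible codes in their natural order.
    With weighted=True, linear disagreement weights are applied.
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def weight(i, j):
        # Disagreement weight: 0 on the diagonal, larger for bigger gaps.
        if weighted:
            return abs(i - j) / (k - 1)
        return 0.0 if i == j else 1.0

    obs_disagreement = sum(weight(i, j) * observed[i][j]
                           for i in range(k) for j in range(k))
    exp_disagreement = sum(weight(i, j) * row[i] * col[j]
                           for i in range(k) for j in range(k))
    return 1.0 - obs_disagreement / exp_disagreement
```

In the usage below, the single disagreement is between adjacent categories, so the weighted statistic penalizes it less than the unweighted one does.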

Results:
The rate of successful manipulations differed among the three tasks (Kruskal-Wallis χ² = 20.585, df = 2, p < 0.001), with a higher rate of successful manipulations in the blue/yellow task than in the push/lift-up (Dunn test: p = 0.013) and the rotating-door (Dunn test: p < 0.001) tasks, and in the push/lift-up task than in the rotating-door task (Dunn test: p = 0.009). The rate of unsuccessful manipulations also differed significantly among tasks (Kruskal-Wallis χ² = 37.741, df = 2, p < 0.001), with higher rates in the blue/yellow task than in the push/lift-up task (Dunn test: p < 0.001) and the rotating-door task (Dunn test: p < 0.001), and no difference between the latter two tasks (Dunn test: p = 0.288). Only the rate of successful task manipulations indicates that task difficulty varied as anticipated (see Discussion).
For all tasks, individuals performed significantly more successful than unsuccessful manipulations. Amongst apparent asocial learners, individuals performed more successful than unsuccessful interactions for the push/lift-up and rotating-door tasks only (Table S4).

Discussion:
Tasks were designed to be of increasing difficulty: blue/yellow task (low), push/lift-up task (medium), rotating-door task (high). Outcomes confirmed that the rotating-door task required more learning time and was therefore more difficult than the push/lift-up and blue/yellow tasks, as expected. The rate of successful manipulations also indicated that the blue/yellow task was the easiest and the rotating-door task the most difficult of the three. These results were inconsistent with those obtained from the rate of unsuccessful manipulations, according to which the blue/yellow task appeared to be more difficult than the other tasks. However, introductions of the blue/yellow task presented a series of flaws that affected the successful retrieval of raisins but not the manipulation itself (which consisted of stretching one hand through a hole, a component that was also present in the other tasks). These problems included: a) an internal mechanism of pendulum doors that hindered the extraction of rewards; b) the impossibility of knowing when the task was empty, causing monkeys to attempt to solve it when success was not possible; and c) a period of habituation to extractive foraging tasks, a novel context for this group of monkeys. All these issues, which most likely increased the number of unsuccessful manipulations and the latency to first success, were solved and/or not observed in the push/lift-up and rotating-door tasks.
Finally, the blue/yellow task was identical to the round-box task used by Kendal et al. (2009) [35] with callitrichids, the easiest task tested in their study, in which they also used a more difficult task (flip-top box) similar to our push/lift-up task. In conclusion, we determine that the blue/yellow task was the easiest task of this study, the push/lift-up task was of medium difficulty and the rotating-door task was the most difficult task tested.

S6 - Multinomial analysis and comparative analysis of threshold percentages for option preference
We tested whether individuals showed a preference for one of the two available solving options in each task using an exact multinomial analysis. An individual was considered to show a preference for an option if the percentage of times it used that option was above a threshold. We ran the multinomial analysis with different percentages of use as the threshold value. An option used a percentage of times above the threshold was considered the individual's task-option preference. If neither option was used a percentage of times above the threshold considered in each case, the individual was considered to have 'No preference' for any of the task options available. Results of the exact multinomial tests can be seen in Tables S6.1 to S6.3.

In general, results showed that, except for 50% (an extremely optimistic threshold) in the blue/yellow task, outcomes were not overly sensitive to which preference threshold criterion was used. Option preferences for each category did not differ among the % thresholds tested for the rotating-door task (see Table S6.3).

Asocial: Purely asocial learning model.
Social: Asocial + social learning model.
ΔAICc: Difference in AICc between asocial and asocial + social learning models.
Support: The degree to which the agent-based model (asocial or social) with the lowest AICc is better than the alternative. For example, in the first line the asocial learning model for grooming was 4.41x better than the corresponding asocial + social learning model for this network.
Approach: Additive (Add), Multiplicative (Multi) or Unconstrained (Unc) model.
Rate of transmission: Constant (Con), Non-constant (Non).
When models provided the same results using different approaches and rates, LRT and CI95% were calculated for those with better estimates of the s' parameter (underlined).
* indicates models that provide evidence of social transmission according to ΔAICc (bold and italics (dark grey shading) = enough evidence; only italics (light grey shading) = almost enough evidence). For interpretation of CI95% for the s' parameter, refer to Table 1 and SI in Hasenjager et al. (2020) [36].
1 Results of ΔAICc using other baseline rates of acquisition, maintaining the approach and rate of transmission of the best model.

Due to collinearity, social rank order, contact latency and preferred option were removed from the analyses (OADA and cTADA) for the blue/yellow task, and contact level and social rank order were not included in the NBDA for the push/lift-up and rotating-door tasks. Since contact level and social rank order measure attributes similar to contact latency and social rank class, respectively, they were used instead when convergence errors in the optimization algorithm persisted even after the optimization method used in the regression model was changed.
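A figure such as "4.41x better" is what one obtains from the conventional evidence ratio between two models, exp(ΔAICc / 2). The sketch below assumes that the 'Support' column was computed this way, which the table notes do not state explicitly; the function names are hypothetical.

```python
import math


def evidence_ratio(delta_aicc):
    """How many times better supported the lower-AICc model is."""
    return math.exp(abs(delta_aicc) / 2.0)


def delta_from_support(support):
    """Inverse relation: ΔAICc implied by a given support value."""
    return 2.0 * math.log(support)
```

Under this assumption, a support of 4.41 corresponds to a ΔAICc of roughly 2.97, and the commonly used cut-off of ΔAICc = 2 corresponds to a support of about 2.72.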

S7 - cTADA results
By default, NBDA assumes that all individuals perform the target behaviour at a similar rate (constant rate of social transmission). The variable 'rate of performance with the task or social transmission of the trait' (Table 5 in Main Text) was calculated and included in the NBDA to test for the influence on social diffusion of non-constant rates of social transmission. Regarding asocial acquisition, a constant baseline function considers that there are multiple steps individuals must undergo to learn a novel task, but the time taken to solve each step is unequal. Therefore, the distribution of times to solve (latencies) asocially tends to an exponential distribution. If the average solving time for each step is considered equal, the latencies follow a gamma distribution (non-constant baseline rate). The Weibull distribution is a continuous probability distribution that can fit an extensive range of distribution shapes. Therefore, it is commonly used as a flexible non-constant baseline function. For more information on these functions, see supplementary information in Hasenjager et al. (2021) [36].
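The flexibility of the Weibull baseline can be illustrated through its hazard (instantaneous rate) function, h(t) = (shape / scale) * (t / scale)^(shape - 1). This is a generic sketch of the distribution, with hypothetical parameter names, rather than the parameterization used by any particular NBDA implementation.

```python
def weibull_hazard(t, shape, scale):
    """Instantaneous rate of acquisition at time t under a Weibull baseline."""
    return (shape / scale) * (t / scale) ** (shape - 1)


# shape == 1 reduces to the constant (exponential) baseline rate 1 / scale;
# shape > 1 gives a rate that increases with time, shape < 1 a decreasing one.
constant = [weibull_hazard(t, shape=1.0, scale=2.0) for t in (1.0, 5.0, 10.0)]
increasing = [weibull_hazard(t, shape=2.0, scale=2.0) for t in (1.0, 2.0, 4.0)]
```

Here `constant` holds the same value at every time point, while `increasing` grows with time, which is the sense in which the Weibull function serves as a non-constant baseline.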
For those cTADA models that provided evidence of social transmission when informed with the social networks, comparisons with a homogeneous network were conducted to determine whether social transmission followed the social network provided [36]. Results can be found in Table S7.

Similar conclusions could be drawn from both OADA and cTADA for the blue/yellow task, except in the case of the 5m proximity network, where the asocial + social learning model was slightly better than the asocial learning model. However, evidence of social transmission was not sufficient in any case for the blue/yellow task (Table 6 in Main Text).

cTADA results for the rotating-door task showed that the asocial + social learning model was better than the asocial learning model in grooming, kinship and observation networks, as opposed to OADA, where this was found only for observation networks. This was also reflected in the significant p values of the LRT for these networks. However, sufficient evidence of social transmission (ΔAICc > 2) was found only in cTADA models informed with observation networks, as in OADA (Table 6 in Main Text).

Only huddling and observation networks in the push/lift-up task provided more support for social transmission than homogeneous networks in cTADA. All networks that provided evidence of social transmission in cTADA for the rotating-door task explained the data better than a homogeneous network, indicating that the transmission pathways likely followed the social network provided [36]. Using a different baseline rate of acquisition for the best cTADA models (i.e., maintaining the approach and rate of transmission of the best model) provided, in most cases, opposite results in terms of which agent-based model (asocial or asocial + social) better explained the data. The only exceptions were 5m proximity networks in the blue/yellow task, huddling and both observation networks in the push/lift-up and rotating-door tasks, and both proximity networks in the rotating-door task (Table S7.1). When different baseline rates of acquisition were tested along with other combinations of approach and rate of transmission, only huddling and 1m observation networks in the push/lift-up task, and huddling and both proximity networks in the rotating-door task, resulted in the same best agent-based model. This suggests that the analysis is dominated by the time course of events as opposed to the pattern of diffusion through the network. In these cases, results of OADA are preferable to those of TADA [36].

S8 - Further discussion on social learning
Hoppitt (2017) empirically demonstrated that observation networks are a direct and powerful way to detect social transmission, even when there is no social structure information or when other networks (e.g. affiliative) cannot provide evidence of social learning [37]. This is because NBDA can provide evidence for social learning if the order in which individuals observe each other follows the order of diffusion [37]. Our results support this statement since networks based on who observed whom during task introductions provided the strongest evidence of social learning.
The effect of number of refills on accelerating learning rates may be due to an increased motivation to interact with the task of those observing rewards being placed inside of it. The slightly faster learning rates for push and counter-clockwise options compared to their alternatives may be due to the fact that lift-up and clockwise actions apparently required the use of, at least, two body parts to retrieve rewards: one to open and hold the door, and one to reach out for raisins. Individuals generally used the same hand to push or move the door counter-clockwise, hold it and retrieve the raisins.
Evidence of social transmission was obtained using the multiplicative approach in all cases, which assumes that social influence multiplies the chances of learning asocially. Accordingly, social transmission in our group of Barbary macaques likely occurred through indirect social learning processes [38], whereby attention is drawn to the task socially but individuals learn how to solve it for themselves. The fact that frequency of access influenced social transmission in those within 1m (who require higher levels of social tolerance than those at 5m), and that individuals did not seem to copy the actions observed in the push/lift-up task, suggests that macaques learned this task asocially, motivated by the presence of a demonstrator at the task (stimulus/local enhancement or social facilitation) [38]. Accordingly, individuals could have been attracted to the task and/or increased their rate of task exploration due to the mere presence of another individual interacting with it [39].
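The distinction between the additive and multiplicative approaches can be sketched with the rate expressions commonly given in the NBDA literature, where s is the social transmission parameter, the network term sums connections to informed individuals, and the asocial effect enters as exp(linear predictor). The exact model fitted in the study is not reproduced here, so the forms and all argument names below are assumptions for illustration.

```python
import math


def acquisition_rate(baseline, s, connections_to_informed,
                     linear_predictor, informed, approach):
    """Rate at which a naive individual acquires the behaviour.

    additive:        baseline * (s * connections + exp(linear_predictor))
    multiplicative:  baseline * (s * connections + 1) * exp(linear_predictor)
    Informed individuals cannot acquire the behaviour again (rate 0).
    """
    if informed:
        return 0.0
    social = s * connections_to_informed
    asocial = math.exp(linear_predictor)
    if approach == "additive":
        return baseline * (social + asocial)
    if approach == "multiplicative":
        return baseline * (social + 1.0) * asocial
    raise ValueError("approach must be 'additive' or 'multiplicative'")
```

In the multiplicative form, individual-level variables scale the social term as well as the asocial one, which matches the description above of social influence multiplying the chances of learning asocially.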
Close-proximity observations (1m), which led to the strongest evidence of social learning in the most difficult task (rotating-door), allow the transmission of detailed information. The fact that individuals copied the preferred action of the most successful individual observed in the rotating-door task suggests that a transient effect of the action observed might have led Barbary macaques to perform the same successful actions [39]. Accordingly, frequency of access to the task might have had no effect on social transmission if individuals had not observed the actions being used and had not had immediate access to the task. We suggest that social learning of the rotating-door task occurred via response facilitation. Here, individuals perform a motor action already in the species' behavioural repertoire, or use a familiar action in a novel context [39]. Barbary macaques operated the rotating door by pushing it aside (i.e. making the door rotate clockwise or counter-clockwise around a screw; novel context). Pushing is a motor action already in the behavioural repertoire of Barbary macaques [49].
Evidence for response facilitation has also been found in macaques (pig-tailed macaques, Macaca nemestrina, [50]; long-tailed macaques, [51]) and other primate species (great apes, [52]). Response facilitation seems unlikely for the push/lift-up task, as there was no evidence that individuals performed the same rewarding actions observed [38].
Our results are similar to findings with wild monkeys. In a task similar to our push/lift-up task, there was evidence for indirect social learning (i.e. stimulus and local enhancement) in wild vervets [46], as in the present study. However, the results presented in our study must be interpreted with caution, since data were collected only for a short period in one group of Barbary macaques.